Using an audio interface device to authenticate another device

ABSTRACT

Disclosed are various embodiments for using an audio interface device to facilitate authentication for other devices. A client device presents an authentication code via an output device of the client device. The authentication code is received from a voice interface device. The voice interface device is in an authenticated state for access to an account, and the voice interface device received the authentication code from speech captured by a microphone of the voice interface device following a spoken wake word. The client device is authenticated for access to the account in response to determining that the authentication code received from the voice interface device matches the authentication code presented by the client device.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of co-pending U.S. utility application entitled, “Using an Audio Interface Device to Authenticate Another Device,” having application Ser. No. 15/665,327, filed Jul. 31, 2017, which is entirely incorporated herein by reference.

BACKGROUND

Users may have to authenticate themselves in order to access networked resources via a computing device. The authentication may be necessary in order to access secured resources such as subscription-based content or to access stored preferences or receive personalized content. Users may be asked to establish a new account or to log in via an account with a federated identity provider. Authentication via a “living room” device such as a television or a set-top box may be cumbersome if an on-screen keyboard is required to enter a username, a password, answers to knowledge-based questions, or other security credentials.

BRIEF DESCRIPTION OF THE DRAWINGS

Many aspects of the present disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, with emphasis instead being placed upon clearly illustrating the principles of the disclosure. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.

FIGS. 1A-1C are drawings of example scenarios involving authentication of a device using an audio interface device according to various embodiments of the present disclosure.

FIG. 2 is a schematic block diagram of a networked environment according to various embodiments of the present disclosure.

FIGS. 3 and 4 are flowcharts illustrating examples of functionality implemented as portions of an authentication service executed in a computing environment in the networked environment of FIG. 2 according to various embodiments of the present disclosure.

FIG. 5 is a flowchart illustrating one example of functionality implemented as portions of a client device in the networked environment of FIG. 2 according to various embodiments of the present disclosure.

FIG. 6 is a flowchart illustrating one example of functionality implemented as portions of an audio interface device in the networked environment of FIG. 2 according to various embodiments of the present disclosure.

FIG. 7 is a schematic block diagram that provides one example illustration of a computing environment employed in the networked environment of FIG. 2 according to various embodiments of the present disclosure.

DETAILED DESCRIPTION

The present disclosure relates to using an audio interface device in order to authenticate another device, such as a television or a set-top box. Authenticating so-called “living room” devices or headless devices can be difficult as they may lack common user input devices such as keyboards or microphones. For example, a television may prompt a user to key in an email address and a password through an on-screen keyboard. However, the user may have to use a remote control having a limited number of buttons (e.g., arrow buttons and an enter button) to manipulate the on-screen keyboard. This can take time and induce frustration, particularly when entering long usernames or passwords that may include mixed cases, numbers, and special characters.

Other approaches may involve code-based linking. With code-based linking, a randomized code may be shown on the display, and the user may be prompted to enter the code using another authenticated device, such as a web browser or special-purpose application on a tablet, smartphone, laptop, desktop, or other device with additional input capabilities. Nonetheless, it may be cumbersome for a user to transfer the displayed code to an authenticated device.

Various embodiments of the present disclosure provide approaches for using an audio interface device to facilitate code-based linking. In one example, a user can simply speak the code to the audio interface device, which then causes the device that displayed the code to become authenticated. In another example, the device requesting authentication can transmit the code to nearby devices (by audio, video, or other signals), and the audio interface device or other limited-capability device can receive the code signal and cause the device requesting authentication to become authenticated. In addition to authenticating a device, the approaches described herein may be used to authorize pending transactions for the device.

Turning now to FIG. 1A, shown is a drawing of an example scenario 100a in which a television 101 is authenticated using an audio interface device 102. To begin, the television 101, which in this example lacks a touchscreen or a keyboard, presents an opportunity for a user to log in to an existing account with a form-based interface at screen 103. When the user navigates to one of the form fields for username or password, the television 101 may show an on-screen keyboard and allow the user to fill in text using buttons on a remote control. Here, an alternative is shown through the button labeled “Register Using Alexandra.” This refers to registering the television using an existing account with an identity provider (here, “Alexandra”), which may be a third party providing identity federation.

Upon selecting the button, the television 101 next renders the screen 106 that presents an authentication code 107. The authentication code 107 may be a randomized or unique code that identifies an authentication request for the television for a certain time window of validity. In this example, the authentication code 107 is “GA99SA,” and the user is instructed to interact with his or her audio interface device 102 to supply the authentication code 107 to the audio interface device 102.

The authentication code 107 can be generated according to a variety of approaches, including those described in U.S. Pat. No. 9,606,983, entitled “HUMAN READABLE MECHANISM FOR COMMUNICATING BINARY DATA,” and issued on Mar. 28, 2017, which is incorporated herein by reference in its entirety. This patent describes techniques for communicating a binary string such as an authentication code 107. A dictionary is seeded with multiple word sets (e.g., a set of nouns and a set of adjectives), and symbols are then created by combining words from the word sets. For example, adjective-noun pairs are created by combining one word from each set to create a symbol. A mapping of symbols to corresponding binary values is generated to translate bit values to symbols and symbols to bit values.
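The symbol mapping described above can be sketched as follows. This is only an illustrative sketch and not the method of the referenced patent: the word sets, their sizes, and the function names `encode` and `decode` are hypothetical, and a practical dictionary would be far larger.

```python
from itertools import product

# Hypothetical word sets; a practical dictionary would be far larger.
ADJECTIVES = ["red", "calm", "tiny", "loud"]
NOUNS = ["apple", "river", "piano", "tiger"]

# Each adjective-noun pair is one symbol; 4 x 4 = 16 symbols carry 4 bits each.
SYMBOLS = [f"{adj} {noun}" for adj, noun in product(ADJECTIVES, NOUNS)]
BITS_PER_SYMBOL = (len(SYMBOLS) - 1).bit_length()
SYMBOL_TO_VALUE = {symbol: value for value, symbol in enumerate(SYMBOLS)}


def encode(bits: str) -> list:
    """Translate a bit string into a list of adjective-noun symbols."""
    if len(bits) % BITS_PER_SYMBOL != 0:
        raise ValueError("bit string length must be a multiple of the symbol width")
    return [
        SYMBOLS[int(bits[i:i + BITS_PER_SYMBOL], 2)]
        for i in range(0, len(bits), BITS_PER_SYMBOL)
    ]


def decode(symbols: list) -> str:
    """Translate a list of symbols back into the original bit string."""
    return "".join(format(SYMBOL_TO_VALUE[s], f"0{BITS_PER_SYMBOL}b") for s in symbols)
```

With these assumed word sets, an eight-bit value becomes two spoken pairs, and decoding the pairs recovers the same eight bits.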

The user next interacts with the audio interface device 102 to supply the authentication code 107. The user first says a wake word (“Alexandra!”) and then says a command, “Register me with,” followed by the authentication code 107 (“G-A-9-9-S-A”). The audio interface device 102 receives the spoken authentication code 107 from the user and reports it back to an authentication service, which verifies that the spoken authentication code 107 matches the authentication code 107 presented by the television 101. As an additional authentication factor, the audio interface device 102 or the authentication service can be configured to verify that the authentication code 107 is spoken by an authorized user, e.g., with voice profiling. If the spoken authentication code 107 does not match the authorized user's voice, the access may be denied, even if the authentication code 107 is correct.

The audio interface device 102 then reports back to the user that “Your television is now registered to your account.” The television 101 next renders a confirmation screen 109 indicating that the television 101 is now registered using a certain account with the identity provider, “JohnSmith123.” Subsequently, the user can interact with the television 101 to access secured content, personalizations, or other resources associated with the “JohnSmith123” account.

Continuing to FIG. 1B, shown is a drawing of another example scenario 100b in which a television 101 is authenticated using an audio interface device 102. In this scenario 100b, the user does not need to repeat an authentication code 107 (FIG. 1A), as the authentication code 107 is communicated directly from the television 101 to the audio interface device 102. Although the same or similar initial screen 103 may be shown, the subsequent screen 112 indicates that a registration process is being performed. The television 101 then emits audio that encodes an authentication code 107, which is represented here by “Hissssss! Whirr! Pop!” In other examples, the audio may be in an ultrasonic frequency and not normally audible by humans. In still other examples, the audio may include music, such as preferred music associated with the user's account, where different user accounts may be associated with different music. Such music may include audible or inaudible watermarking to encode the authentication code 107.

The audio interface device 102 is in a listening mode, and as such, picks up the audio via a microphone and interprets the audio as the authentication code 107. The audio interface device 102 may send the authentication code 107 or the audio over a network to an authentication service for processing. As shown, the audio interface device 102 may prompt the user to confirm authentication of the television. In this example, the audio interface device 102 informs the user via audio that “a television is seeking to access your account. Do you wish to approve?” The user follows with an approval of “Yes,” and the audio interface device 102 interacts with the authentication service to authenticate and register the television 101 for access to account resources. In some cases, the audio interface device 102 can be configured to verify that the approval is spoken by an authorized user, e.g., with voice profiling. If the approval does not match the authorized user's voice, the access may be denied. At the screen 115, the television 101 reports a confirmation indicating that the television 101 is now registered using a certain account with the identity provider, “JohnSmith123.”

Moving on to FIG. 1C, shown is a drawing of another example scenario 100c in which a pending transaction for the television 101 is authenticated using an audio interface device 102. At screen 118, the television 101 informs the user that a movie may be purchased by repeating a certain phrase (corresponding to an authentication code 107 (FIG. 1A)) to an audio interface device 102. The user then wakes the audio interface device 102 (“Alexandra!”) and proceeds to say the authentication code 107 (“Bean-Orange-Car-Telegram-Berry”). The audio interface device 102 receives this authentication code 107 via a microphone and sends it to an authentication service. The authentication service verifies the authentication code 107 and causes the audio interface device 102 to report back that the transaction has been authorized. Subsequently, at screen 121, the television 101 reports to the user that the purchase is complete and the account associated with the audio interface device 102 (“JohnSmith123”) has been charged.

As an additional authentication factor, the audio interface device 102 can be configured to verify that the authentication code 107 is spoken by an authorized user, e.g., with voice profiling. If the spoken authentication code 107 does not match the authorized user's voice, the access may be denied, even if the authentication code 107 is correct. In one embodiment, the set of authorized users may be dynamically determined for a given area. For example, all users logged into an application on their smartphones that report a location within a threshold proximity of the audio interface device 102 may be considered authorized users, where an authorization by one of the set of users could cause that respective user's account to be charged or otherwise to have the transaction performed relative to that respective user's account.
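One way to picture the dynamically determined set of authorized users is the sketch below. It assumes, purely for illustration, that each smartphone reports a latitude/longitude pair and that proximity is measured by great-circle (haversine) distance; the function names, the 50-meter default threshold, and the sample coordinates are hypothetical and do not come from the disclosure.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def distance_m(a, b):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, h)))


def authorized_users(device_location, reported_locations, threshold_m=50.0):
    """Return the users whose reported location is within the threshold of the device."""
    return {
        user
        for user, location in reported_locations.items()
        if distance_m(device_location, location) <= threshold_m
    }


# Hypothetical sample data: one user in the same room, one several hundred meters away.
device = (47.6062, -122.3321)
reports = {"alice": (47.6062, -122.3321), "bob": (47.6100, -122.3300)}
nearby = authorized_users(device, reports)
```

Under these assumptions, only the user reporting a location within the threshold would be eligible to approve the transaction.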

Similarly, transaction authorization may be performed in the context of a drive-through window, where the audio interface device 102 is inside or integrated into a motor vehicle. For example, purchase identifying information may be transmitted to the audio interface device 102 from a speaker or other output device of the drive-through. User approval may be solicited in order to approve a transaction, which may include placing an order as well as possibly charging a payment instrument associated with the user's account. In one embodiment, the purchase identifying information including contents of a proposed order may be transmitted to the audio interface device 102 and presented to the user via a display or audibly through a speaker. Through the authentication principles of the present disclosure, the computing systems of the drive-through may be able to identify the user and leverage user personalization, such as recommendations based on the user's prior orders.

In the following discussion, a general description of the system and its components is provided, followed by a discussion of the operation of the same.

With reference to FIG. 2, shown is a networked environment 200 according to various embodiments. The networked environment 200 includes a computing environment 203, one or more client devices 206, and one or more audio interface devices 102, which are in data communication with each other via a network 209. The network 209 includes, for example, the Internet, intranets, extranets, wide area networks (WANs), local area networks (LANs), wired networks, wireless networks, cable networks, satellite networks, or other suitable networks, etc., or any combination of two or more such networks.

The computing environment 203 may comprise, for example, a server computer or any other system providing computing capability. Alternatively, the computing environment 203 may employ a plurality of computing devices that may be arranged, for example, in one or more server banks or computer banks or other arrangements. Such computing devices may be located in a single installation or may be distributed among many different geographical locations. For example, the computing environment 203 may include a plurality of computing devices that together may comprise a hosted computing resource, a grid computing resource, and/or any other distributed computing arrangement. In some cases, the computing environment 203 may correspond to an elastic computing resource where the allotted capacity of processing, network, storage, or other computing-related resources may vary over time.

Various applications and/or other functionality may be executed in the computing environment 203 according to various embodiments. Also, various data is stored in a data store 215 that is accessible to the computing environment 203. The data store 215 may be representative of a plurality of data stores 215 as can be appreciated. The data stored in the data store 215, for example, is associated with the operation of the various applications and/or functional entities described below.

The components executed on the computing environment 203, for example, include an authentication service 218 and other applications, services, processes, systems, engines, or functionality not discussed in detail herein. The authentication service 218 is executed to authenticate client devices 206 and audio interface devices 102 for access to user accounts. The user accounts may provide access to secured resources or other network services. As will be described, the authentication service 218 can leverage authentication of an audio interface device 102 to bootstrap or assist authentication of client devices 206 that may lack input or output devices, such as touchscreens, keyboards, or audio interfaces. The authentication service 218 may similarly facilitate approval of pending transactions being performed by the client devices 206 using the audio interface device 102.

The data stored in the data store 215 includes, for example, account data 221, authentication codes 107, authentication code generation rules 222, risk-based factors 223, and potentially other data. The account data 221 includes data associated with user accounts for network services. In some situations, the operator of the authentication service 218 may provide the network services. Otherwise, the operator of the authentication service 218 may provide a federated identity service that may be used by third parties to authenticate access to network services operated by the third parties. The account data 221 may include security credentials 224, access tokens 227, secured resources 230, authentication rules 233, voice recognition profiles 234, one or more locations 235, and/or other data.

The security credentials 224 may include usernames, passwords, answers to knowledge-based questions, keys, biometric profiles, personal identification numbers, and/or other long-lived credentials used to authorize or authenticate access to an account. The security credentials 224 are long-lived in the sense that they may persist for a relatively long period of time (e.g., ninety days) until they are required to be changed, or perhaps indefinitely until changed by the user. Authentication for access to an account may require a combination of one or more security credentials 224.

The account data 221 can also include access tokens 227 that are used to provide client devices 206 with access to the account. For example, once a client device 206 is authenticated, an access token 227 can be issued to the client device 206 that is used by the client device 206 to access a network service that requires authentication. The access tokens 227 may be long-lived or short-lived. For example, an access token 227 that is a registration token may be valid indefinitely, while an access token 227 that is a session token may be valid for a short time period (e.g., one hour, or until an application is exited).
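The distinction between long-lived and short-lived access tokens can be illustrated with the following sketch. The `AccessToken` class, the `issue_token` function, the `kind` labels, and the one-hour default are hypothetical names and values chosen for illustration; the disclosure does not prescribe a token format.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AccessToken:
    value: str
    kind: str                    # "registration" or "session" (illustrative labels)
    expires_at: Optional[float]  # None means valid indefinitely

    def is_valid(self, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        return self.expires_at is None or now < self.expires_at


def issue_token(kind: str, now: Optional[float] = None,
                session_ttl_s: float = 3600.0) -> AccessToken:
    """Issue a registration token (no expiry) or a session token (short expiry)."""
    now = time.time() if now is None else now
    expires_at = None if kind == "registration" else now + session_ttl_s
    return AccessToken(secrets.token_urlsafe(32), kind, expires_at)
```

In this sketch a registration token never expires on its own, while a session token stops being valid once its assumed one-hour lifetime has elapsed.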

The secured resources 230 may correspond to protected data of an account that a client device 206 that is authenticated for the account can access. To this end, the secured resources 230 may include protected content, licenses, encryption keys, preferences, personalizations, interaction histories, transaction histories, and so forth. The authentication rules 233 may configure how client devices 206 are authenticated and under what conditions transactions may be authorized or approved. For example, an authentication rule 233 may enable a parental control system that requires additional verification in order to complete a transaction or perform authentication. The voice recognition profiles 234 may include data that profiles a user's voice so as to enable speaker identification or recognition. The locations 235 may indicate locations where a user may be a potential speaker or authorizer of transactions. Such locations 235 may be reported by the audio interface device 102 or other client devices 206 based on network address geolocation, network access point or cell tower location finding, global positioning system (GPS) coordinates, and/or other approaches.

The authentication codes 107 include randomized or unique codes that are generated to facilitate authentication or authorization of a client device 206 by an audio interface device 102. The authentication code 107 can include a string of letters, numbers, and special characters, and/or words and phrases. In some cases, the authentication code 107 may be binary data. Each authentication code 107 can be associated with a session identifier 236 that uniquely identifies a client device 206 that has requested authentication. An authentication code 107 can also be associated with a time window 239 for validity, where the authentication code 107 may be valid only if presented within the time window 239. The time window 239 may be selected to be relatively brief (e.g., thirty seconds) to minimize the chance that an authentication code 107 could be compromised or reused.

The authentication code generation rules 222 control the generation of authentication codes 107 by the authentication service 218 or by the client devices 206. The authentication code generation rules 222 may specify a required level of entropy, or complexity, for authentication codes 107. For example, entropy may be increased by requiring more characters or using a larger character set. Where the authentication code 107 is a word or phrase, entropy can be increased by using a larger dictionary or by using a greater number of words. Authentication codes 107 with lower levels of entropy may be susceptible to brute force compromises. In one embodiment, an authentication code 107 may correspond to a security assertion signed by a key in a public/private key pair. Where the authentication code 107 is generated for a user to speak, the authentication code generation rules 222 may dictate that certain characters with similar pronunciation in a given dialect or language may not be used. For example, it may be that a “b” may be indistinguishable from a “v” for many speakers of a given dialect, and the authentication code generation rules 222 may specify that both “b” and “v” are to be avoided. However, it is noted that this may depend on a specified language or dialect in use by users of the audio interface device 102.
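The relationship between code length, character set size, and entropy can be made concrete with a short sketch. For a code of n characters drawn uniformly from an alphabet of k characters, the entropy is n × log2(k) bits. The function names, the 30-bit default, and the choice of uppercase letters and digits are assumptions for illustration; the excluded letters follow the “b” and “v” example in the text.

```python
import math
import secrets
import string

# Letters treated as confusable when spoken, per the "b"/"v" example (assumed set).
CONFUSABLE = frozenset("BV")


def spoken_alphabet(confusable=CONFUSABLE):
    """Uppercase letters and digits, minus characters that sound alike."""
    return [c for c in string.ascii_uppercase + string.digits if c not in confusable]


def entropy_bits(length, alphabet_size):
    """Entropy of a uniformly random code: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)


def min_length(required_bits, alphabet_size):
    """Smallest code length that reaches the required entropy."""
    return math.ceil(required_bits / math.log2(alphabet_size))


def generate_code(required_bits=30.0, confusable=CONFUSABLE):
    """Draw a random code long enough to meet the entropy requirement."""
    alphabet = spoken_alphabet(confusable)
    length = min_length(required_bits, len(alphabet))
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

Removing two characters shrinks the alphabet from 36 to 34, which lowers the entropy per character only slightly, so under these assumptions a six-character code still exceeds 30 bits.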

The risk-based factors 223 include factors that may lead the authentication service 218 to preclude authentication or authorization of a transaction independently from whether the audio interface device 102 is authenticated or if a correct authentication code 107 has been provided. For instance, certain geographies may be less trusted, or may not be associated with the account data 221 as a location 235. Also, unusual or atypical activity associated with an account may be a factor indicating risk.

The client device 206 is representative of a plurality of client devices that may be coupled to the network 209. The client device 206 may comprise, for example, a processor-based system such as a computer system. Such a computer system may be embodied in the form of a desktop computer, a laptop computer, personal digital assistants, cellular telephones, smartphones, set-top boxes, music players, web pads, tablet computer systems, game consoles, electronic book readers, smartwatches, head mounted displays, voice interface devices, or other devices. The client device 206 may include a display 242. The display 242 may comprise, for example, one or more devices such as liquid crystal display (LCD) displays, gas plasma-based flat panel displays, organic light emitting diode (OLED) displays, electrophoretic ink (E ink) displays, LCD projectors, or other types of display devices, etc.

The client device 206 may be configured to execute various applications such as a client application 245 and/or other applications. The client application 245 may be executed in a client device 206, for example, to access network content served up by the computing environment 203 and/or other servers, thereby rendering a user interface 248 on the display 242. To this end, the client application 245 may comprise, for example, a browser, a dedicated application, etc., and the user interface 248 may comprise a network page, an application screen, etc. The client device 206 may be configured to execute applications beyond the client application 245 such as, for example, email applications, social networking applications, and/or other applications.

The audio interface device 102 is representative of a plurality of audio or voice devices that may be coupled to the network 209. The audio interface device 102 may comprise, for example, a processor-based system such as a computer system. The audio interface device 102 may take the form of a standalone speaker device, a remote control device, a tablet computer, a smartphone, computing hardware integrated into a motor vehicle, or another type of client device. Some forms of the audio interface device 102 may have a display, while other forms may not. The audio interface device 102 includes one or more audio input devices 272 and one or more audio output devices 275. The audio input devices 272 may comprise a microphone, a microphone-level audio input, a line-level audio input, or other types of input devices. The audio output devices 275 may comprise a speaker, a speaker output, a headphone output, a line-level audio output, or other types of output devices. In one embodiment, the audio interface device 102 includes at least one integrated microphone and at least one integrated speaker within a single enclosure.

The audio interface device 102 may also include a speech synthesizer 278 and one or more client applications 281. The speech synthesizer 278 may be configured to transform text inputs into speech for one or more languages using one or more standard voice profiles. The client applications 281 may enable functionality such as personal assistant functionality, home automation functionality, television control functionality, music playback functionality, and/or other interactive functions. The client applications 281 may be configured to perform natural language processing and/or speech to text functions.

It is noted that voice recognition and processing functions may be divided among the computing environment 203 and the audio interface device 102. Thus, the authentication service 218 may be in communication with an application implementing the server-side functionality of the audio interface device 102. The server-side functions of the audio interface device 102 may be on the same servers as the computing environment 203 if operated by the same entity, or they may be on servers operated by a different entity. The audio interface device 102 includes at least enough hardware or software in order to authenticate with one or more servers in the computing environment 203 and to facilitate communication between the audio interface device 102 and the computing environment 203 in an authenticated context.

Additional examples of an audio interface device 102 may be found in U.S. patent application Ser. No. 14/456,620, entitled “VOICE APPLICATION ARCHITECTURE,” filed on Aug. 11, 2014, which was published as U.S. Patent Application Publication 2016/0042748 on Feb. 11, 2016; and in U.S. patent application Ser. No. 14/107,931, entitled “ATTRIBUTE-BASED AUDIO CHANNEL ARBITRATION,” filed on Dec. 16, 2013, which was published as U.S. Patent Application Publication 2015/0170665 on Jun. 18, 2015. Both applications and their respective publications are incorporated herein by reference in their entirety.

Turning now to FIG. 3, shown is a flowchart that provides one example of the operation of a portion of the authentication service 218 according to various embodiments. It is understood that the flowchart of FIG. 3 provides merely an example of the many different types of functional arrangements that may be employed to implement the operation of the portion of the authentication service 218 as described herein. As an alternative, the flowchart of FIG. 3 may be viewed as depicting an example of elements of a method implemented in the computing environment 203 (FIG. 2) according to one or more embodiments.

Beginning with box 303, the authentication service 218 receives an authentication request from a first client device 206 (FIG. 2). For example, the first client device 206 may be a television or set-top box, or another device lacking a keyboard or touchscreen, and the user may have navigated to setup functionality using a remote control. Specifically, the user may have selected an option to authenticate or register the first client device 206 using an existing account through a second client device 206, such as an audio interface device 102. Alternatively, the user may have activated a button or other control of the first client device 206 that causes the authentication request to be sent.

In box 306, the authentication service 218 generates an authentication code 107 (FIG. 2) in response to the authentication request. Alternatively, the first client device 206 may be configured to generate the authentication code 107 and to report the authentication code 107 to the authentication service 218. The authentication code 107 is generated according to rules specified in the authentication code generation rules 222 (FIG. 2). These rules may specify, for example, a required level of entropy for the authentication code 107, including a number of characters, a character set, a number of words, and/or a dictionary as may be applicable. The authentication code 107 may be generated randomly. In one embodiment, the authentication code 107 may be checked against other previously generated authentication codes 107 to ensure uniqueness. A time window 239 (FIG. 2) may be associated with the generated authentication code 107 and stored in the data store 215. In addition, the authentication request may specify a unique session identifier 236 (FIG. 2) which may also be stored in the data store 215 in association with the authentication code 107.
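The operations of box 306 can be sketched roughly as follows. This is a minimal sketch under stated assumptions: a plain dictionary stands in for the data store 215, the alphabet and the `issue_code` function are hypothetical, and expired codes are purged before the uniqueness check so that the check only considers codes that are still outstanding.

```python
import secrets
import time

# Hypothetical character set for illustration only.
ALPHABET = "ACDEFGHJKLMNPQRSTUWXYZ2345679"


def issue_code(store, session_id, length=6, window_s=30.0, now=None):
    """Generate a unique random code and record its session and time window.

    `store` is a dict standing in for the data store: code -> record.
    """
    now = time.time() if now is None else now

    # Drop codes whose time window has already closed.
    for expired in [c for c, rec in store.items() if rec["expires_at"] <= now]:
        del store[expired]

    # Regenerate until the code does not collide with an outstanding code.
    while True:
        code = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if code not in store:
            break

    store[code] = {
        "session_id": session_id,      # identifies the requesting client device
        "issued_at": now,
        "expires_at": now + window_s,  # end of the validity window
    }
    return code
```

A caller would pass the session identifier from the authentication request and send the returned code to the first client device for presentation.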

In box 309, the authentication service 218 sends the authentication code 107 to the first client device 206 by way of the network 209 (FIG. 2). The first client device 206 may thus be configured to present the authentication code 107 to the user by rendering it in a user interface 248 (FIG. 2) on a display 242. In another example, the first client device 206 may read out the authentication code 107 as audio via a speaker. In still other examples, the first client device 206 may broadcast the authentication code 107 as an ultrasonic signal (e.g., above the limits of human hearing at approximately 20 kHz), as modulated light via a display 242, as a two-dimensional barcode such as a quick response (QR) code, or as a broadcast message via a local computer network (e.g., WI-FI, BLUETOOTH, Ethernet, etc.).

While the authentication code 107 is being presented, the second client device 206 may be in a listening mode, or a user may cause the second client device 206 to enter a listening mode, e.g., by pressing a button or saying a wake word. The user may then say the authentication code 107 that is presented by the first client device 206, or the authentication code 107 may be broadcast by the first client device 206. The second client device 206 receives the authentication code 107 via an environmental sensor such as an audio input device 272 (FIG. 2).

The second client device 206 may solicit a user approval via speech generated by the speech synthesizer 278 and presented by the audio output device 275. This user approval may be provided in terms of a verbal confirmation, a physical gesture (e.g., a user performing a thumbs up gesture in front of a video sensor of the second client device 206), a button press on the second client device 206, or some other action. In box 312, the authentication service 218 receives the authentication code 107 from the second client device 206. The second client device 206 is already authenticated for access to the account. In some embodiments, the authentication service 218 may perform a speaker or voice identification on audio captured by the second client device 206 to verify that the speaker matches the voice recognition profile 234 (FIG. 2) associated with the account.

In box 315, the authentication service 218 begins a series of verifications, including determining whether the authentication code 107 received from the second client device 206 matches the prior authentication code 107 presented via the first client device 206. If there is not a match, the authentication service 218 moves to box 318 and denies the authentication request of the first client device 206. Thereafter, the operation of the portion of the authentication service 218 ends.

Otherwise, if the authentication code 107 received from the second client device 206 matches the prior authentication code 107, the authentication service 218 continues from box 315 to box 319. In box 319, the authentication service 218 determines whether the authentication code 107 is received from the second client device 206 within a time window 239 (FIG. 2) for validity. If the authentication code 107 is not received within the time window 239, the authentication service 218 moves to box 318 and denies the authentication request of the first client device 206. Thereafter, the operation of the portion of the authentication service 218 ends.

Otherwise, if the authentication code 107 is received within the time window 239, the authentication service 218 continues from box 319 to box 320. In box 320, the authentication service 218 may determine whether an approval has been received from an authorized user. For example, the authentication service 218 may also verify that the voice speaking the authentication code 107 or a verbal confirmation matches the voice recognition profile 234 of the authorized user. If an approval is not received, or if the approval is not received from the authorized user, the authentication service 218 may move to box 318 and deny the authentication request of the first client device 206. Thereafter, the operation of the portion of the authentication service 218 ends.

Otherwise, if an approval is received from an authorized user, the authentication service 218 continues from box 320 to box 321. In box 321, the authentication service 218 may confirm that any risk-based factors 223 (FIG. 2) do not weigh against authentication. For example, authentication may be denied if the first client device 206 is in an unusual or risky geographic area, or if a multiplicity of authentication requests are received from the first client device 206, or if other risk-based factors 223 weigh against authentication. If risk-based factors 223 weigh against authentication, the authentication service 218 may move to box 318 and deny the authentication request of the first client device 206. Thereafter, the operation of the portion of the authentication service 218 ends.

Otherwise, if risk-based factors 223 do not weigh against authentication, the authentication service 218 continues from box 321 to box 324. In box 324, the authentication service 218 determines an account associated with the second client device 206. In box 327, the authentication service 218 authenticates the first client device 206 for access to the same account. For example, the authentication service 218 may determine the session identifier 236 associated with the validated authentication code 107 and approve the corresponding session and first client device 206 to be given an access token 227 for the account. As another example, the authentication service 218 may issue a registration token to the first client device 206 for access to the account. Thereafter, the operation of the portion of the authentication service 218 ends.
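As a non-limiting illustration, the series of verifications in boxes 315 through 321 may be sketched in Python as shown below. The names PendingCode and verify_submission, the reason strings, and the use of a constant-time comparison are assumptions made for this sketch and are not specified by the disclosure. The approval and risk determinations are passed in as precomputed Boolean values because the disclosure leaves their computation open.

```python
import hmac
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PendingCode:
    """Hypothetical record stored in the data store 215 for one request."""
    code: str          # authentication code 107 presented by the first device
    session_id: str    # session identifier 236
    issued_at: float   # issue time, in seconds
    window_s: float    # length of the time window 239, in seconds


def verify_submission(pending: PendingCode,
                      submitted_code: str,
                      received_at: float,
                      approved_by_authorized_user: bool,
                      risk_weighs_against: bool) -> Tuple[bool, str]:
    """Return (authenticated, reason); every denial corresponds to box 318."""
    # Box 315: the submitted code must match the presented code.
    # compare_digest is an added precaution against timing side channels.
    if not hmac.compare_digest(pending.code.encode("utf-8"),
                               submitted_code.encode("utf-8")):
        return False, "code_mismatch"
    # Box 319: the code must arrive within the time window for validity.
    elapsed = received_at - pending.issued_at
    if elapsed < 0 or elapsed > pending.window_s:
        return False, "expired"
    # Box 320: an approval from an authorized user may be required.
    if not approved_by_authorized_user:
        return False, "not_approved"
    # Box 321: risk-based factors 223 must not weigh against authentication.
    if risk_weighs_against:
        return False, "risk"
    # Boxes 324 and 327: proceed to authenticate the first client device.
    return True, "ok"
```

In this sketch, a caller that receives (True, "ok") would continue by looking up the account of the second client device 206 and issuing the access token 227 for the session named by session_id.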

Referring next to FIG. 4, shown is a flowchart that provides one example of the operation of a portion of the authentication service 218 according to various embodiments. It is understood that the flowchart of FIG. 4 provides merely an example of the many different types of functional arrangements that may be employed to implement the operation of the portion of the authentication service 218 as described herein. As an alternative, the flowchart of FIG. 4 may be viewed as depicting an example of elements of a method implemented in the computing environment 203 (FIG. 2) according to one or more embodiments.

Beginning with box 403, the authentication service 218 receives a transaction authorization request from a first client device 206 (FIG. 2). For example, the first client device 206 may be a television, set-top box, or another device that lacks a keyboard or touchscreen. The user may have navigated to secured content, such as a movie, that requires payment for purchase. Alternatively, the user may have activated a button or other control of the first client device 206 that causes the transaction authorization request to be sent.

In box 406, the authentication service 218 generates an authorization code in response to the transaction authorization request. Alternatively, the first client device 206 may be configured to generate the authorization code and to report the authorization code to the authentication service 218. The authorization code is generated according to rules specified in the authentication code generation rules 222 (FIG. 2). These rules may specify, for example, a required level of entropy for the authorization code, including a number of characters, a character set, a number of words, and/or a dictionary as may be applicable. The authorization code may be generated randomly. In one embodiment, the authorization code may be checked against other previously generated authorization codes to ensure uniqueness. A time window 239 (FIG. 2) may be associated with the generated authorization code and stored in the data store 215. In addition, the authorization request may specify a unique session identifier 236 (FIG. 2) which may also be stored in the data store 215 in association with the authorization code.
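As a non-limiting illustration, random generation of a code that meets a required level of entropy and is checked for uniqueness may be sketched in Python as shown below. The function name generate_code, its parameters, and the choice of the standard secrets module are assumptions made for this sketch; the disclosure does not prescribe a particular generator or character set.

```python
import math
import secrets
from typing import Set


def generate_code(min_entropy_bits: float, alphabet: str,
                  existing: Set[str]) -> str:
    """Draw a random code with at least min_entropy_bits of entropy.

    alphabet stands in for the character set named by the generation
    rules 222; existing holds previously generated codes (hypothetical).
    """
    if len(set(alphabet)) != len(alphabet) or len(alphabet) < 2:
        raise ValueError("alphabet needs at least two distinct symbols")
    # Each uniformly chosen symbol contributes log2(len(alphabet)) bits.
    bits_per_symbol = math.log2(len(alphabet))
    length = max(1, math.ceil(min_entropy_bits / bits_per_symbol))
    while True:
        code = "".join(secrets.choice(alphabet) for _ in range(length))
        # Redraw on collision so that the returned code is unique.
        if code not in existing:
            return code
```

For instance, a requirement of 20 bits with a ten-digit alphabet yields a seven-digit code in this sketch, since six digits would provide only about 19.9 bits. The same approach could draw words from a dictionary instead of characters from a character set.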

In box 409, the authentication service 218 sends the authorization code to the first client device 206 by way of the network 209 (FIG. 2). The first client device 206 may thus be configured to present the authorization code to the user by rendering it in a user interface 248 (FIG. 2) on a display 242. In another example, the first client device 206 may read out the authorization code as audio via a speaker using a speech synthesizer 278. In still other examples, the first client device 206 may broadcast the authorization code as an ultrasonic signal (e.g., above the limits of human hearing at approximately 20 kHz), as modulated light via a display 242, or as a broadcast message via a local network interface of the first client device 206 to a local computer network (e.g., WI-FI, BLUETOOTH, Ethernet, etc.).

While the authorization code is being presented, the second client device 206 may be in a listening mode, or a user may cause the second client device 206 to enter a listening mode, e.g., by pressing a button or saying a wake word. The user may then say the authorization code that is presented by the first client device 206, or the authorization code may be broadcast by the first client device 206. The second client device 206 receives the authorization code via an environmental sensor such as an audio input device 272 (FIG. 2).

The second client device 206 may solicit a user approval via speech generated by the speech synthesizer 278 and presented by the audio output device 275. In box 412, the authentication service 218 receives the authorization code from the second client device 206. The second client device 206 is already authenticated for access to an account capable of approving the transaction. In some embodiments, the authentication service 218 may perform a speaker or voice identification on audio captured by the second client device 206 to verify that the speaker matches the voice recognition profile 234 (FIG. 2) associated with the account, and that the speaker has permission to approve the pending transaction.

In box 415, the authentication service 218 begins a series of verifications, including determining whether the authorization code received from the second client device 206 matches the prior authorization code presented via the first client device 206. If there is not a match, the authentication service 218 moves to box 418 and denies the transaction authorization request of the first client device 206. Thereafter, the operation of the portion of the authentication service 218 ends.

Otherwise, if the authorization code received from the second client device 206 matches the prior authorization code, the authentication service 218 continues from box 415 to box 419. In box 419, the authentication service 218 determines whether the authorization code is received from the second client device 206 within a time window 239 (FIG. 2) for validity. If the authorization code is not received within the time window 239, the authentication service 218 moves to box 418 and denies the transaction authorization request of the first client device 206. Thereafter, the operation of the portion of the authentication service 218 ends.

Otherwise, if the authorization code is received within the time window 239, the authentication service 218 continues from box 419 to box 420. In box 420, the authentication service 218 may determine whether an approval has been received from an authorized user. For example, the authentication service 218 may also verify that the voice speaking the authorization code or a verbal confirmation matches the voice recognition profile 234 of the authorized user. If an approval is not received, or if the approval is not received from the authorized user, the authentication service 218 may move to box 418 and deny the transaction authorization request of the first client device 206. Thereafter, the operation of the portion of the authentication service 218 ends.

Otherwise, if an approval is received from an authorized user, the authentication service 218 continues from box 420 to box 421. In box 421, the authentication service 218 may confirm that any risk-based factors 223 (FIG. 2) do not weigh against authorizing the transaction. For example, transaction authorization may be denied if the first client device 206 is in an unusual or risky geographic area, or if a multiplicity of requests are received from the first client device 206, or if other risk-based factors 223 weigh against authorization. If risk-based factors 223 weigh against transaction authorization, the authentication service 218 may move to box 418 and deny the transaction authorization request of the first client device 206. Thereafter, the operation of the portion of the authentication service 218 ends.

Otherwise, if risk-based factors 223 do not weigh against transaction authorization, the authentication service 218 continues from box 421 to box 424. In box 424, the authentication service 218 determines the pending transaction associated with the first client device 206. In box 427, the authentication service 218 authorizes the transaction using payments or resources of an account associated with the second client device 206. For example, the authentication service 218 may cause a payment instrument (e.g., a bank account or credit card) associated with the account to be charged in order to authorize the transaction. Thereafter, the operation of the portion of the authentication service 218 ends.

Continuing to FIG. 5, shown is a flowchart that provides one example of the operation of a portion of the client device 206 according to various embodiments. It is understood that the flowchart of FIG. 5 provides merely an example of the many different types of functional arrangements that may be employed to implement the operation of the portion of the client device 206 as described herein. As an alternative, the flowchart of FIG. 5 may be viewed as depicting an example of elements of a method implemented in the client device 206 according to one or more embodiments.

Beginning with box 503, the client device 206 receives a user request to access resources of an existing account with an identity provider. For example, the user may select a button or other component on a user interface 248 that is associated with launching an authentication request for the identity provider. In box 506, the client device 206 sends an authentication request to the authentication service 218 (FIG. 2) via the network 209 (FIG. 2). The authentication request can include a session identifier 236 (FIG. 2). In one embodiment, the request may lack specification of a particular account, where the account may be inferred through the client device that is used to approve the authentication. In box 509, the client device 206 receives an authentication code 107 (FIG. 2) from the authentication service 218. Alternatively, the client device 206 may generate an authentication code 107 and inform the authentication service 218.

In box 512, the client device 206 presents the authentication code 107. In various examples, this may entail showing the authentication code 107 on the display 242 (FIG. 2), reading out the authentication code 107 via a speech synthesizer 278, modulating an ultrasonic signal or light signal with the authentication code 107, broadcasting the authentication code 107 via a local computer network, and/or other approaches. In some cases, the client device 206 may generate a wake signal in order to wake an audio interface device 102 (FIG. 2), so that the audio interface device 102 will enter a listening mode to recognize the authentication code 107 being broadcast. The authentication code 107 may be presented multiple times in a looping fashion. Delimiters and error correction coding may be used to ensure integrity of the transmission of the authentication code 107. For example, a cyclic redundancy check (CRC) may be performed on the authentication code 107 to verify its integrity.
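As a non-limiting illustration, the use of delimiters together with a cyclic redundancy check may be sketched in Python as shown below. The frame layout, the delimiter characters, and the function names frame_code and parse_frame are assumptions made for this sketch. Note that a CRC only detects corruption; it does not correct it, so a receiver in this sketch would simply wait for the next repetition of the looped broadcast.

```python
import zlib
from typing import Optional

# Hypothetical delimiters marking the start and end of one broadcast frame.
START, END, SEP = "<", ">", "|"


def frame_code(code: str) -> str:
    """Wrap a code in delimiters and append its CRC-32 as 8 hex digits."""
    crc = zlib.crc32(code.encode("utf-8"))
    return f"{START}{code}{SEP}{crc:08x}{END}"


def parse_frame(frame: str) -> Optional[str]:
    """Return the code if the frame is well formed and its CRC matches."""
    if not (frame.startswith(START) and frame.endswith(END)):
        return None
    body = frame[len(START):-len(END)]
    # Split on the last separator so the code itself may contain SEP.
    code, sep, crc_hex = body.rpartition(SEP)
    if not sep:
        return None
    try:
        expected = int(crc_hex, 16)
    except ValueError:
        return None
    if zlib.crc32(code.encode("utf-8")) != expected:
        return None  # corrupted in transit; wait for the next repetition
    return code
```

A receiving device could apply parse_frame to each delimited span recovered from the ultrasonic, light, or network channel and forward only a code whose check succeeds.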

In box 515, the client device 206 determines whether the authentication request was approved. For example, the authentication service 218 may communicate to the client device 206 that the request was approved or denied. Alternatively, lack of communication may indicate that the request was denied. If the request was not approved, the client device 206 may move from box 515 to box 518 and inform the user that the authentication has failed. Thereafter, the operation of the portion of the client device 206 ends.

Otherwise, if the authentication succeeded, the client device 206 moves from box 515 to box 521 and receives an access token 227 (FIG. 2) from the authentication service 218. In box 524, the client device 206 informs the user that the authentication was successful and accesses one or more secured resources 230 (FIG. 2) of the account using the access token 227. Thereafter, the operation of the portion of the client device 206 ends.

Referring next to FIG. 6, shown is a flowchart that provides one example of the operation of a portion of the audio interface device 102 according to various embodiments. It is understood that the flowchart of FIG. 6 provides merely an example of the many different types of functional arrangements that may be employed to implement the operation of the portion of the audio interface device 102 as described herein. As an alternative, the flowchart of FIG. 6 may be viewed as depicting an example of elements of a method implemented in the audio interface device 102 according to one or more embodiments.

Beginning with box 603, the audio interface device 102 detects a wake signal. This may correspond to a user saying a wake word or phrase, another device emitting a predefined sound or signal, or a user activating a button on the audio interface device 102. In some embodiments, the authentication code 107 may correspond to a wake signal. In other embodiments, a wake signal may be unnecessary, as the audio interface device 102 may always be in an active listening mode. In box 606, the audio interface device 102 enters an active listening mode via one or more environmental sensors, such as an audio input device 272 (FIG. 2). In box 609, the audio interface device 102 receives an authentication code 107 (FIG. 2), either by way of the user dictating the authentication code 107 or through a broadcast by the client device 206 (FIG. 2) that is requesting authentication. For example, the audio interface device 102 may receive the authentication code 107 as speech recorded via a microphone of the audio interface device 102.

In box 612, the audio interface device 102 may confirm the identity of the user and/or obtain a user approval of authentication. For example, the audio interface device 102 may perform a speaker recognition/identification procedure with reference to one or more known voice recognition profiles 234 (FIG. 2) to ensure that the user who is dictating the authentication code 107 and/or giving approval has permission. If the audio interface device 102 detects a broadcast of an authentication code 107 by a client device 206, the audio interface device 102 may request explicit approval from the user before proceeding with the authentication.

In one embodiment, multiple users potentially associated with multiple user accounts may be at a location 235. In such a case, the processing performed by the audio interface device 102 or at the backend by the computing environment 203 may identify an account out of multiple potential accounts corresponding to a particular speaker who has been identified as giving an approval or stating an authentication code 107. For example, the presence of a user at a particular location 235 may be registered by a mobile device of the user with the computing environment 203, and the computing environment 203 could include the corresponding voice recognition profile 234 of the user as a possibility when identifying speech at a geographic area surrounding the location 235.
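As a non-limiting illustration, narrowing the candidate voice recognition profiles to users whose presence is registered at a location, and then selecting one account, may be sketched in Python as shown below. The function name identify_account, the presence and score mappings, and the threshold value are assumptions made for this sketch. The similarity scores are taken as given inputs because the disclosure does not specify how a voice is compared with a profile.

```python
from typing import Dict, Optional, Set


def identify_account(location_id: str,
                     presence: Dict[str, Set[str]],
                     scores: Dict[str, float],
                     threshold: float = 0.8) -> Optional[str]:
    """Pick the account whose voice profile best matches the speaker.

    presence maps a location 235 to accounts whose users are registered
    there; scores maps an account to a hypothetical similarity between
    the captured speech and that account's voice recognition profile 234.
    """
    best_account: Optional[str] = None
    best_score = threshold
    # Only accounts registered at this location are considered at all.
    for account in sorted(presence.get(location_id, set())):
        score = scores.get(account, 0.0)
        if score >= best_score and (best_account is None or score > best_score):
            best_account, best_score = account, score
    return best_account
```

In this sketch, a speaker who scores highly but is not registered at the location is never selected, and None is returned when no registered candidate reaches the threshold.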

In box 615, the audio interface device 102 sends the authentication code 107 and/or the audio containing the authentication code 107 to the authentication service 218 for processing. In box 618, the audio interface device 102 determines whether the authentication service 218 reports that the authentication has been approved. If authentication has not been approved, the audio interface device 102 moves from box 618 to box 621 and informs the user that the authentication has failed. Thereafter, the operation of the portion of the audio interface device 102 ends.

If the authentication has been approved, the audio interface device 102 instead proceeds from box 618 to box 624 and informs the user that the authentication has succeeded. The audio interface device 102 may give an indication that identifies the client device 206 that has been authenticated (e.g., a brand name and model number of a television). Thereafter, the operation of the portion of the audio interface device 102 ends.

Although the foregoing flowcharts of FIGS. 3-6 represent the authentication and transaction approval as being Boolean decisions, it is understood that additional logic may be present to handle situations of uncertainty. Uncertain voice identification or elevated risk factors may be mitigated through additional processes. As an example, a transaction intent verification may be pushed to a mobile device associated with the account. As another example, additional voice samples may be solicited from the user, e.g., by asking the user to repeat words.
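As a non-limiting illustration, a decision that is not strictly Boolean may be sketched in Python as a three-way outcome, where the intermediate outcome triggers an additional process such as a pushed intent verification or a request for further voice samples. The Decision names, the numeric scores, and all threshold values are assumptions made for this sketch and do not appear in the disclosure.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    STEP_UP = "step_up"   # mitigate uncertainty with an additional process
    DENY = "deny"


def decide(voice_confidence: float, risk_score: float,
           approve_at: float = 0.9, deny_below: float = 0.5,
           low_risk: float = 0.3, high_risk: float = 0.7) -> Decision:
    """Map hypothetical confidence and risk scores in [0, 1] to a decision."""
    # Clear failures: very poor voice match or clearly elevated risk.
    if voice_confidence < deny_below or risk_score >= high_risk:
        return Decision.DENY
    # Clear successes: strong voice match and low risk.
    if voice_confidence >= approve_at and risk_score < low_risk:
        return Decision.APPROVE
    # Everything in between is uncertain and calls for mitigation.
    return Decision.STEP_UP
```

For instance, decide(0.7, 0.1) returns Decision.STEP_UP in this sketch, which could lead to asking the user to repeat words before the request is approved or denied.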

With reference to FIG. 7, shown is a schematic block diagram of the computing environment 203 according to an embodiment of the present disclosure. The computing environment 203 includes one or more computing devices 700. Each computing device 700 includes at least one processor circuit, for example, having a processor 703 and a memory 706, both of which are coupled to a local interface 709. To this end, each computing device 700 may comprise, for example, at least one server computer or like device. The local interface 709 may comprise, for example, a data bus with an accompanying address/control bus or other bus structure as can be appreciated.

Stored in the memory 706 are both data and several components that are executable by the processor 703. In particular, stored in the memory 706 and executable by the processor 703 is the authentication service 218 and potentially other applications. Also stored in the memory 706 may be a data store 215 and other data. In addition, an operating system may be stored in the memory 706 and executable by the processor 703.

It is understood that there may be other applications that are stored in the memory 706 and are executable by the processor 703 as can be appreciated. Where any component discussed herein is implemented in the form of software, any one of a number of programming languages may be employed such as, for example, C, C++, C#, Objective C, Java®, JavaScript®, Perl, PHP, Visual Basic®, Python®, Ruby, Flash®, or other programming languages.

A number of software components are stored in the memory 706 and are executable by the processor 703. In this respect, the term “executable” means a program file that is in a form that can ultimately be run by the processor 703. Examples of executable programs may be, for example, a compiled program that can be translated into machine code in a format that can be loaded into a random access portion of the memory 706 and run by the processor 703, source code that may be expressed in proper format such as object code that is capable of being loaded into a random access portion of the memory 706 and executed by the processor 703, or source code that may be interpreted by another executable program to generate instructions in a random access portion of the memory 706 to be executed by the processor 703, etc. An executable program may be stored in any portion or component of the memory 706 including, for example, random access memory (RAM), read-only memory (ROM), hard drive, solid-state drive, USB flash drive, memory card, optical disc such as compact disc (CD) or digital versatile disc (DVD), floppy disk, magnetic tape, or other memory components.

The memory 706 is defined herein as including both volatile and nonvolatile memory and data storage components. Volatile components are those that do not retain data values upon loss of power. Nonvolatile components are those that retain data upon a loss of power. Thus, the memory 706 may comprise, for example, random access memory (RAM), read-only memory (ROM), hard disk drives, solid-state drives, USB flash drives, memory cards accessed via a memory card reader, floppy disks accessed via an associated floppy disk drive, optical discs accessed via an optical disc drive, magnetic tapes accessed via an appropriate tape drive, and/or other memory components, or a combination of any two or more of these memory components. In addition, the RAM may comprise, for example, static random access memory (SRAM), dynamic random access memory (DRAM), or magnetic random access memory (MRAM) and other such devices. The ROM may comprise, for example, a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or other like memory device.

Also, the processor 703 may represent multiple processors 703 and/or multiple processor cores and the memory 706 may represent multiple memories 706 that operate in parallel processing circuits, respectively. In such a case, the local interface 709 may be an appropriate network that facilitates communication between any two of the multiple processors 703, between any processor 703 and any of the memories 706, or between any two of the memories 706, etc. The local interface 709 may comprise additional systems designed to coordinate this communication, including, for example, performing load balancing. The processor 703 may be of electrical or of some other available construction.

Although the authentication service 218 and other various systems described herein may be embodied in software or code executed by general purpose hardware as discussed above, as an alternative the same may also be embodied in dedicated hardware or a combination of software/general purpose hardware and dedicated hardware. If embodied in dedicated hardware, each can be implemented as a circuit or state machine that employs any one of or a combination of a number of technologies. These technologies may include, but are not limited to, discrete logic circuits having logic gates for implementing various logic functions upon an application of one or more data signals, application specific integrated circuits (ASICs) having appropriate logic gates, field-programmable gate arrays (FPGAs), or other components, etc. Such technologies are generally well known by those skilled in the art and, consequently, are not described in detail herein.

The flowcharts of FIGS. 3-6 show the functionality and operation of an implementation of portions of the authentication service 218, the client device 206, and the audio interface device 102. If embodied in software, each block may represent a module, segment, or portion of code that comprises program instructions to implement the specified logical function(s). The program instructions may be embodied in the form of source code that comprises human-readable statements written in a programming language or machine code that comprises numerical instructions recognizable by a suitable execution system such as a processor 703 in a computer system or other system. The machine code may be converted from the source code, etc. If embodied in hardware, each block may represent a circuit or a number of interconnected circuits to implement the specified logical function(s).

Although the flowcharts of FIGS. 3-6 show a specific order of execution, it is understood that the order of execution may differ from that which is depicted. For example, the order of execution of two or more blocks may be scrambled relative to the order shown. Also, two or more blocks shown in succession in FIGS. 3-6 may be executed concurrently or with partial concurrence. Further, in some embodiments, one or more of the blocks shown in FIGS. 3-6 may be skipped or omitted. In addition, any number of counters, state variables, warning semaphores, or messages might be added to the logical flow described herein, for purposes of enhanced utility, accounting, performance measurement, or providing troubleshooting aids, etc. It is understood that all such variations are within the scope of the present disclosure.

Also, any logic or application described herein, including the authentication service 218, that comprises software or code can be embodied in any non-transitory computer-readable medium for use by or in connection with an instruction execution system such as, for example, a processor 703 in a computer system or other system. In this sense, the logic may comprise, for example, statements including instructions and declarations that can be fetched from the computer-readable medium and executed by the instruction execution system. In the context of the present disclosure, a “computer-readable medium” can be any medium that can contain, store, or maintain the logic or application described herein for use by or in connection with the instruction execution system.

The computer-readable medium can comprise any one of many physical media such as, for example, magnetic, optical, or semiconductor media. More specific examples of a suitable computer-readable medium would include, but are not limited to, magnetic tapes, magnetic floppy diskettes, magnetic hard drives, memory cards, solid-state drives, USB flash drives, or optical discs. Also, the computer-readable medium may be a random access memory (RAM) including, for example, static random access memory (SRAM) and dynamic random access memory (DRAM), or magnetic random access memory (MRAM). In addition, the computer-readable medium may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or other type of memory device.

Further, any logic or application described herein, including the authentication service 218, may be implemented and structured in a variety of ways. For example, one or more applications described may be implemented as modules or components of a single application. Further, one or more applications described herein may be executed in shared or separate computing devices or a combination thereof. For example, a plurality of the applications described herein may execute in the same computing device 700, or in multiple computing devices 700 in the same computing environment 203.

Disjunctive language such as the phrase “at least one of X, Y, or Z,” unless specifically stated otherwise, is otherwise understood with the context as used in general to present that an item, term, etc., may be either X, Y, or Z, or any combination thereof (e.g., X, Y, and/or Z). Thus, such disjunctive language is not generally intended to, and should not, imply that certain embodiments require at least one of X, at least one of Y, or at least one of Z to each be present.

It should be emphasized that the above-described embodiments of the present disclosure are merely possible examples of implementations set forth for a clear understanding of the principles of the disclosure. Many variations and modifications may be made to the above-described embodiment(s) without departing substantially from the spirit and principles of the disclosure. All such modifications and variations are intended to be included herein within the scope of this disclosure and protected by the following claims.

Therefore, the following is claimed:
 1. A method, comprising: causing,via at least one of one or more computing devices, a client device topresent an authentication code via an output device of the clientdevice; receiving, via at least one of the one or more computingdevices, the authentication code from a voice interface device, whereinthe voice interface device is in an authenticated state for access to anaccount, and the voice interface device received the authentication codefrom speech captured by a microphone of the voice interface devicefollowing a spoken wake word; and authenticating, via at least one ofthe one or more computing devices, the client device for access to theaccount in response to determining that the authentication code receivedfrom the voice interface device matches the authentication codepresented by the client device.
 2. The method of claim 1, wherein theclient device lacks both a touchscreen and a keyboard.
 3. The method ofclaim 1, wherein the voice interface device lacks both a touchscreen anda keyboard.
 4. The method of claim 1, wherein the voice interface deviceis associated with a fixed location.
 5. The method of claim 1, furthercomprising: determining, via at least one of the one or more computingdevices, that a risk-based factor does not preclude authentication ofthe client device for access to the account; and issuing, via at leastone of the one or more computing devices, a registration token to theclient device for access to the account.
 6. The method of claim 1,further comprising: randomly generating, via at least one of the one ormore computing devices, the authentication code based at least in parton a required level of entropy and a time window for validity; andsending, via at least one of the one or more computing devices, theauthentication code to the client device via a network.
 7. The method ofclaim 1, wherein authenticating the client device for access to theaccount further comprises: performing, via at least one of the one ormore computing devices, voice identification on the speech; anddetermining, via at least one of the one or more computing devices, thatan identified voice matches a voice profile of a user associated withthe account.
 8. The method of claim 1, wherein authenticating the clientdevice for access to the account further comprises determining, via atleast one of the one or more computing devices, that the authenticationcode is valid for the account according to a time window for validityfor the authentication code.
 9. The method of claim 1, wherein theoutput device of the client device is a local network interface, and theclient device presents the authentication code by sending broadcast datavia the local network interface.
 10. The method of claim 1, wherein theoutput device of the client device is a display, and the client devicepresents the authentication code by showing the authentication code onthe display.
 11. The method of claim 1, wherein the output device of theclient device is a speaker, and the client device presents theauthentication code by emitting an audio signal via the speaker.
 12. Themethod of claim 11, wherein the audio signal encodes the authenticationcode in an ultrasonic frequency range.
 13. The method of claim 11, wherein the audio signal encodes the authentication code as a voice generated by a voice synthesizer.
 14. A system, comprising: at least one computing device; and instructions executable in the at least one computing device, wherein when executed the instructions cause the at least one computing device to at least: cause a client device to present an authentication code via a display of the client device; receive the authentication code from a voice interface device, wherein the voice interface device is in an authenticated state for access to an account, and the voice interface device received the authentication code as speech recorded via a microphone of the voice interface device in response to an entrance of the voice interface device into a listening mode following detection of an audio wake signal; and authenticate the client device for access to the account in response to determining that the authentication code received from the voice interface device matches the authentication code presented by the client device.
 15. The system of claim 14, wherein the audio wake signal corresponds to a spoken wake word.
 16. The system of claim 14, wherein when executed the instructions further cause the at least one computing device to at least randomly generate the authentication code.
 17. The system of claim 14, whereinthe account is unspecified by the client device.
 18. The system of claim 14, wherein when executed the instructions further cause the at least one computing device to at least authorize a transaction associated with the client device in response to determining that the authentication code received from the voice interface device matches the authentication code presented by the client device.
 19. A non-transitory computer-readable medium embodying a program executable in at least one computing device, wherein when executed the program causes the at least one computing device to at least: cause a client device to present an authentication code via a display of the client device; receive the authentication code from a voice interface device, wherein the voice interface device is in an authenticated state for access to an account, and the voice interface device received the authentication code as speech recorded via a microphone of the voice interface device in response to an entrance of the voice interface device into a listening mode following detection of an audio wake signal; and authenticate the client device for access to the account in response to determining that the authentication code received from the voice interface device matches the authentication code presented by the client device.
 20. The non-transitory computer-readable medium of claim 19, wherein when executed the program further causes the at least one computing device to at least authorize a transaction associated with the client device in response to determining that the authentication code received from the voice interface device matches the authentication code presented by the client device.
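
By way of non-limiting illustration only, and forming no part of the claims, the random generation of an authentication code from a required level of entropy and a time window for validity, together with the subsequent matching determination, might be sketched as follows. The names IssuedCode, generate_code, and code_matches are hypothetical and do not appear in the disclosure, as are the use of a decimal digit code and the Python secrets module; an actual embodiment may differ.

```python
import math
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IssuedCode:
    """Hypothetical record of a code sent to the client device."""
    code: str          # decimal digits the client device presents
    expires_at: float  # end of the validity window (seconds since epoch)


def generate_code(entropy_bits: int, window_seconds: float,
                  now: Optional[float] = None) -> IssuedCode:
    """Randomly generate a digit code carrying at least entropy_bits of entropy."""
    # Each decimal digit carries log2(10) bits, so round the digit count up.
    digits = math.ceil(entropy_bits / math.log2(10))
    value = secrets.randbelow(10 ** digits)
    issued_at = time.time() if now is None else now
    return IssuedCode(code=f"{value:0{digits}d}",
                      expires_at=issued_at + window_seconds)


def code_matches(issued: IssuedCode, spoken_code: str,
                 now: Optional[float] = None) -> bool:
    """Return True if the code relayed by the voice interface device matches
    the presented code and the validity window has not elapsed."""
    current = time.time() if now is None else now
    if current > issued.expires_at:
        return False
    # Constant-time comparison of the two codes as bytes.
    return secrets.compare_digest(issued.code.encode(), spoken_code.encode())
```

In this sketch the caller would pass the code transcribed from the captured speech as spoken_code; voice identification, risk-based factors, and issuance of a registration token are outside its scope.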